Letter-Phoneme Alignment: An Exploration

Authors

  • Sittichai Jiampojamarn
  • Grzegorz Kondrak
Abstract

Letter-phoneme alignment is usually generated by a straightforward application of the EM algorithm. We explore several alternative alignment methods that employ phonetics, integer programming, and sets of constraints, and propose a novel approach of refining the EM alignment by aggregation of best alignments. We perform both intrinsic and extrinsic evaluation of the assortment of methods. We show that our proposed EM-Aggregation algorithm leads to the improvement of the state of the art in letter-to-phoneme conversion on several different data sets.
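The baseline mentioned in the abstract, alignment learned by EM, can be illustrated with a deliberately small sketch. It is not the authors' implementation: it uses hard EM (Viterbi re-estimation) rather than full EM, and it assumes each letter maps to exactly one phoneme or to a null symbol. The names `NULL`, `best_alignment`, `make_logp`, and `train`, as well as the smoothing value, are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

NULL = "_"  # hypothetical marker: the letter produces no phoneme


def best_alignment(letters, phones, logp):
    """Best monotone alignment in which each letter emits one phoneme or NULL.

    Returns a list of (letter, phoneme_or_NULL) pairs, or None when there are
    more phonemes than letters (not representable under this assumption).
    """
    n, m = len(letters), len(phones)
    neg = float("-inf")
    score = [[neg] * (m + 1) for _ in range(n + 1)]
    took_phone = [[False] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(0, min(i, m) + 1):
            # Option 1: letter i-1 is silent.
            if score[i - 1][j] > neg:
                s = score[i - 1][j] + logp(letters[i - 1], NULL)
                if s > score[i][j]:
                    score[i][j], took_phone[i][j] = s, False
            # Option 2: letter i-1 emits phoneme j-1.
            if j > 0 and score[i - 1][j - 1] > neg:
                s = score[i - 1][j - 1] + logp(letters[i - 1], phones[j - 1])
                if s > score[i][j]:
                    score[i][j], took_phone[i][j] = s, True
    if score[n][m] == neg:
        return None
    # Trace back from the final cell to recover the alignment.
    out, j = [], m
    for i in range(n, 0, -1):
        if took_phone[i][j]:
            out.append((letters[i - 1], phones[j - 1]))
            j -= 1
        else:
            out.append((letters[i - 1], NULL))
    return out[::-1]


def make_logp(counts, vocab_size, smoothing):
    """Smoothed log P(phoneme | letter); uniform when no counts exist yet."""
    def logp(letter, phone):
        if counts is None:
            return 0.0
        total = sum(counts[letter].values())
        seen = counts[letter].get(phone, 0.0)
        return math.log((seen + smoothing) / (total + smoothing * vocab_size))
    return logp


def train(pairs, iterations=5, smoothing=0.1):
    """Hard-EM loop: align with the current model, then re-count the pairs."""
    vocab = {p for _, phones in pairs for p in phones} | {NULL}
    counts = None
    for _ in range(iterations):
        logp = make_logp(counts, len(vocab), smoothing)
        new_counts = defaultdict(lambda: defaultdict(float))
        for letters, phones in pairs:
            alignment = best_alignment(letters, phones, logp)
            if alignment is None:
                continue  # skip pairs this simple model cannot align
            for letter, phone in alignment:
                new_counts[letter][phone] += 1.0
        counts = new_counts
    return counts
```

Calling `train` on a list of `(word, phoneme_list)` pairs returns letter-to-phoneme counts, and `best_alignment` can then be reused with `make_logp` to align individual words. A fuller treatment would use expected counts from a forward-backward pass and allow many-to-many links; the alternatives the abstract lists (phonetics, integer programming, constraints, aggregation of best alignments) are not attempted here.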


Similar resources

Reducing the Annotation Effort for Letter-to-Phoneme Conversion

Letter-to-phoneme (L2P) conversion is the process of producing a correct phoneme sequence for a word, given its letters. It is often desirable to reduce the quantity of training data — and hence human annotation — that is needed to train an L2P classifier for a new language. In this paper, we confront the challenge of building an accurate L2P classifier with a minimal amount of training data by...


Phoneme Segmenting Alignment with the Common Core Foundational Skills

In 2006, the easyCBM reading assessment system was developed to support the progress monitoring of phoneme segmenting, letter names and sounds recognition, word reading, passage reading fluency, and comprehension skill development in elementary schools. More recently, the Common Core Standards in English Language Arts have been introduced as a framework for outlining grade-level achievement exp...


Novel Entropy based moving average

The training of precise speech recognition models depends on accurate segmentation of the phonemes in a training corpus. Segmentation is typically performed using HMMs, but recent speech recognition work suggests that the transient acoustic features characteristic of manner-class phoneme boundaries (landmarks) may be more precisely localized using acoustic classifiers specifically designed for ...


Using Rules to Improve Letter to Sound Conversion of Names

This paper presents an investigation of the use of context sensitive rewrite rules for improving the performance of data driven letter to sound conversion, concentrating on the specific case of British names. Taking a practical point of view, emphasis is put on reduction of the worst phonetization errors, and on improving the maintainability of the system helping in database cultivation, and al...


Comparison of two tree-structured approaches for grapheme-to-phoneme conversion

Recently, we described a two-step self-learning approach for grapheme-to-phoneme (G2P) conversion [1]. In the first step, grapheme and phoneme strings in the training data are aligned via an iterative Viterbi procedure that may insert graphemic and phonemic nulls where required. In the second step, a Trie structure encoding pronunciation rules is generated. In this paper we describe the alignme...




Publication date: 2010